Analysis of a Collaborative Filter Based on Popularity Amongst Neighbors
In this paper, we analyze a collaborative filter that answers the simple
question: What is popular amongst your friends? While this basic principle
seems to be prevalent in many practical implementations, there does not appear
to be much theoretical analysis of its performance; we partly fill this gap
here. While recent works on this topic, such as the low-rank matrix
completion literature, consider the probability of error in recovering the
entire rating matrix, we consider probability of an error in an individual
recommendation (bit error rate (BER)). For a mathematical model introduced in
[1],[2], we identify three regimes of operation for our algorithm (named
Popularity Amongst Friends (PAF)) in the limit as the matrix size grows to
infinity. In a regime characterized by a large number of samples and few
degrees of freedom (both defined precisely for the model in the paper), the
asymptotic BER is zero; in a regime characterized by a large number of samples
and many degrees of freedom, the asymptotic BER is bounded away from 0 and 1/2
(and is identified exactly except for a special case); and in a regime
characterized by a small number of samples, the algorithm fails. We also
present numerical results for the MovieLens and Netflix datasets. We discuss
the empirical performance in light of our theoretical results and compare with
an approach based on low-rank matrix completion.
Comment: 47 pages. Submitted to IEEE Transactions on Information Theory
(revised in July 2011). A shorter version is to be presented at ISIT 201
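The popularity-amongst-friends principle can be sketched in a few lines. The sketch below is a toy illustration, not the paper's exact PAF algorithm: the binary rating alphabet {+1, -1} (with 0 marking missing entries) and the agreement-based neighbor similarity are assumptions made for the example.

```python
import numpy as np

def paf_recommend(ratings, user, item, k=3):
    """Recommend by popularity amongst the k nearest neighbors.

    ratings: (n_users, n_items) array with entries in {+1, -1} and 0 for
    missing. Neighbors are ranked by agreement on co-rated items; the
    predicted rating is the majority vote of neighbors who rated `item`.
    """
    n_users = ratings.shape[0]
    sims = np.full(n_users, -np.inf)
    for other in range(n_users):
        if other == user:
            continue
        # Fraction of co-rated items on which the two users agree.
        co = (ratings[user] != 0) & (ratings[other] != 0)
        if co.any():
            sims[other] = np.mean(ratings[user, co] == ratings[other, co])
    neighbors = np.argsort(sims)[::-1][:k]
    votes = [ratings[nb, item] for nb in neighbors if ratings[nb, item] != 0]
    if not votes:
        return 0  # abstain: no neighbor has rated the item
    return 1 if sum(votes) > 0 else -1
```

On a small example, a user whose two closest neighbors both liked an item receives a +1 prediction for it.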
Consistent Signal Parameter Estimation with 1-Bit Dithered Sampling
Published in the conference proceedings of EUSIPCO, Florence, Italy, 200
A Channel Coding Perspective of Collaborative Filtering
We consider the problem of collaborative filtering from a channel coding
perspective. We model the underlying rating matrix as a finite alphabet matrix
with block constant structure. The observations are obtained from this
underlying matrix through a discrete memoryless channel with a noisy part
representing noisy user behavior and an erasure part representing missing data.
Moreover, the clusters over which the underlying matrix is constant are
unknown. We establish a sharp threshold result for this model: if the largest
cluster size is smaller than (where the rating matrix is of size
), then the underlying matrix cannot be recovered with any
estimator, but if the smallest cluster size is larger than , then
we show a polynomial time estimator with diminishing probability of error. In
the case of uniform cluster size, not only the order of the threshold, but also
the constant is identified.
Comment: 32 pages, 1 figure. Submitted to IEEE Transactions on Information
Theory
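The observation model described above (a block-constant rating matrix passed through a channel with noisy user behavior and erasures for missing data) can be illustrated with a small simulation. The cluster layout, binary alphabet, and channel parameters below are illustrative assumptions; the estimator itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def observe(n_row_clusters=4, n_col_clusters=4, cluster_size=8,
            flip_prob=0.1, erasure_prob=0.7, alphabet=(1, -1)):
    """Sample a block-constant matrix and observe it through a
    discrete memoryless channel with flips and erasures (0 = missing)."""
    # One latent symbol per (row-cluster, column-cluster) pair.
    blocks = rng.choice(alphabet, size=(n_row_clusters, n_col_clusters))
    # Expand to the full matrix. The estimator does not know the cluster
    # assignments; we expand contiguously only for simplicity here.
    X = np.kron(blocks, np.ones((cluster_size, cluster_size), dtype=int))
    flips = rng.random(X.shape) < flip_prob
    Y = np.where(flips, -X, X)                  # noisy user behavior
    Y[rng.random(X.shape) < erasure_prob] = 0   # missing data
    return X, Y
```

With the defaults this produces a 32 x 32 underlying matrix and an observation in which roughly 70% of entries are erased.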
A Novel Conflict-Free Memory and Processor Architecture for DVB-T2 LDPC Decoding
In this paper, we present a flexible architecture for an LDPC decoder that fully exploits the structure of the codes defined in the DVB-T2 standard (Digital Video Broadcasting - Second Generation Terrestrial). We propose a processor and memory architecture that uses the flooding schedule and is free of the memory access conflicts encountered in the serial-schedule decoders proposed in the literature. Thus, unlike previous works, we do not require any extra logic or ad hoc designs to resolve memory conflicts. Despite the typically slower convergence of the flooding schedule compared to serial-schedule decoders, our architecture meets the throughput and BER requirements specified in the DVB-T2 standard. Our design allows a trade-off between memory size and performance through the choice of the number of bits per message, without affecting the general memory arrangement. Moreover, our architecture is not algorithm specific: any check-node message processing algorithm (Sum-Product, Min-Sum, etc.) can be used without modifying the basic architecture. Furthermore, by simply adding small ROM tables, we obtain a decoder that is fully compatible with all three second-generation DVB standards (DVB-T2, DVB-S2 and DVB-C2). We present simulation results to demonstrate the viability of our solution both functionally and in terms of bit-error rate performance. We also discuss the memory requirements and throughput of the architecture, and present preliminary synthesis results in 130 nm CMOS technology.
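A flooding-schedule decoder can be sketched in software as follows. This is a generic min-sum sketch for illustration, with dense arrays standing in for the decoder's message memories; it is not the proposed hardware architecture, but it shows why flooding avoids contention: every check-node update reads only messages from the previous iteration, and all variable nodes are then updated at once.

```python
import numpy as np

def minsum_flood(H, llr, iters=20):
    """Flooding-schedule min-sum LDPC decoding.

    H: (m, n) binary parity-check matrix; llr: channel LLRs (positive
    values favor bit 0). Returns the hard-decision codeword estimate.
    """
    m, n = H.shape
    M = H * llr  # variable-to-check messages, initialized to channel LLRs
    for _ in range(iters):
        # Check-node update: sign product and minimum magnitude,
        # excluding each edge's own incoming message.
        E = np.zeros_like(M)
        for i in range(m):
            idx = np.flatnonzero(H[i])
            msgs = M[i, idx]
            signs = np.sign(msgs)
            signs[signs == 0] = 1
            total_sign = np.prod(signs)
            mags = np.abs(msgs)
            order = np.argsort(mags)
            min1, min2 = mags[order[0]], mags[order[1]]
            for t, j in enumerate(idx):
                other_min = min2 if j == idx[order[0]] else min1
                E[i, j] = total_sign * signs[t] * other_min
        # Variable-node update (extrinsic) and tentative hard decision.
        post = llr + E.sum(axis=0)
        M = H * (post - E)  # exclude each check's own contribution
        hard = (post < 0).astype(int)
        if not np.any(H @ hard % 2):
            break  # all parity checks satisfied
    return hard
```

On a small (7,4) Hamming code, a single weakly flipped bit is corrected in one iteration.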
WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation
Visual anomaly classification and segmentation are vital for automating
industrial quality inspection. The focus of prior research in the field has
been on training custom models for each quality inspection task, which requires
task-specific images and annotation. In this paper we move away from this
regime, addressing zero-shot and few-normal-shot anomaly classification and
segmentation. Recently CLIP, a vision-language model, has shown revolutionary
generality with zero-/few-shot performance competitive with
full supervision. But CLIP falls short on anomaly classification and
segmentation tasks. Hence, we propose window-based CLIP (WinCLIP) with (1) a
compositional ensemble on state words and prompt templates and (2) efficient
extraction and aggregation of window/patch/image-level features aligned with
text. We also propose its few-normal-shot extension WinCLIP+, which uses
complementary information from normal images. In MVTec-AD (and VisA), without
further tuning, WinCLIP achieves 91.8%/85.1% (78.1%/79.6%) AUROC in zero-shot
anomaly classification and segmentation, while WinCLIP+ achieves 93.1%/95.2%
(83.8%/96.4%) in the 1-normal-shot setting, surpassing the state of the art by
large margins.
Comment: Accepted to the Conference on Computer Vision and Pattern Recognition
(CVPR) 202
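The compositional ensemble of point (1) can be illustrated as follows: every (template, state-word) pair yields one text prompt, split into normal and anomalous sets. The template and state-word lists below are illustrative placeholders, not WinCLIP's exact lists.

```python
def compositional_prompts(object_name):
    """Build a compositional prompt ensemble for an inspected object.

    Returns (normal_prompts, anomalous_prompts): the cross product of
    prompt templates and state words, with the object name filled in.
    """
    templates = [
        "a photo of a {}.",
        "a cropped photo of the {}.",
        "a close-up photo of a {}.",
    ]
    normal_states = ["{}", "flawless {}", "perfect {}"]
    anomalous_states = ["damaged {}", "{} with a flaw", "{} with a defect"]

    def expand(states):
        # One prompt per (template, state) combination.
        return [t.format(s.format(object_name))
                for t in templates for s in states]

    return expand(normal_states), expand(anomalous_states)
```

With 3 templates and 3 state words per class, each class gets a 9-prompt ensemble whose text embeddings would then be averaged.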
Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark
Most existing works on few-shot object detection (FSOD) focus on a setting
where both pre-training and few-shot learning datasets are from a similar
domain. However, few-shot algorithms matter across many domains, so
evaluation needs to reflect this breadth of applications. We propose a Multi-dOmain
Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a
wide range of domains to evaluate FSOD algorithms. We comprehensively analyze
the impacts of freezing layers, different architectures, and different
pre-training datasets on FSOD performance. Our empirical results show several
key factors that have not been explored in previous works: 1) contrary to
previous belief, on a multi-domain benchmark, fine-tuning (FT) is a strong
baseline for FSOD, performing on par or better than the state-of-the-art (SOTA)
algorithms; 2) utilizing FT as the baseline allows us to explore multiple
architectures, and we find that they have a significant impact on downstream
few-shot tasks, even with similar pre-training performances; 3) by decoupling
pre-training and few-shot learning, MoFSOD allows us to explore the impact of
different pre-training datasets, and the right choice can boost the performance
of the downstream tasks significantly. Based on these findings, we list
possible avenues of investigation for improving FSOD performance and propose
two simple modifications to existing algorithms that lead to SOTA performance
on the MoFSOD benchmark. The code is available at
https://github.com/amazon-research/few-shot-object-detection-benchmark.
Comment: Accepted at ECCV 202
Can We Improve Over Weber Sampling of Haptic Signals?
In applications such as telesurgery, haptic signals must be transmitted to a remote location with a delay of at most a few milliseconds. To reduce the packet rate while retaining perceptual quality, adaptive sampling has been explored in the literature. In particular, in earlier work we proposed and analyzed an adaptive sampling scheme based on Weber's law of perception. In this paper, we explore other possible adaptive sampling candidates. We describe an experimental setup in which users are subjected to piecewise constant haptic stimuli to which they can respond with a click. We record the clicks and ask: can we identify signal features and classifiers that predict the clicks? The answer suggests adaptive sampling schemes that improve over Weber sampling.
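Weber sampling, the baseline discussed above, can be sketched as follows: a new sample is transmitted only when the relative change since the last transmitted value exceeds a just-noticeable-difference (JND) threshold. The threshold value and the signal in the example are illustrative, not taken from the experiments.

```python
def weber_sample(signal, jnd=0.1):
    """Adaptive sampling per Weber's law.

    Transmit a sample only when the relative change since the last
    transmitted value exceeds the JND threshold `jnd`.
    Returns the indices of the transmitted samples.
    """
    if not signal:
        return []
    kept = [0]          # always transmit the first sample
    last = signal[0]
    for i, x in enumerate(signal[1:], start=1):
        # Guard against division by zero when the reference value is 0.
        if last == 0 or abs(x - last) / abs(last) > jnd:
            kept.append(i)
            last = x
    return kept
```

Small fluctuations around the last transmitted value are suppressed, so the packet rate drops while perceptually significant changes are still sent.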